Abstract. The task of verifying the compatibility between interacting web services has
traditionally been limited to checking the compatibility of the interaction protocol in terms of
message sequences and the type of data being exchanged. Since web services are developed largely in
an uncoordinated way, different services often use independently developed ontologies for the same
domain instead of adhering to a single standard ontology. In this work we investigate the
approaches that the server can take to verify whether a state with semantically inconsistent
results can be reached during the execution of a protocol with a client, if the client ontology
is published. Often a database is used to store the actual data alongside the ontologies, instead
of storing the actual data as a part of the ontology description. It is important to observe that,
at the current state of the database, the semantic conflict state may not be reached even if the
verification done by the server indicates the possibility of reaching a conflict state. A relational
algebra based decision procedure is also developed to incorporate the current state of the client
and the server databases in the overall verification procedure.



1 Introduction 

An ontology is regarded as a formal specification of a (usually hierarchical) set of concepts and the
relations between them. The need for developing intelligent web services that can automatically
interact with other web services has been one of the primary forces behind recent research towards
standardization of ontologies of specific domains of interest [?]. For example, if several online
book stores follow the same ontology for the book domain, then an intelligent web service can
automatically search these book stores to find books in a particular category.

In the context of the next generation of the web, it is envisaged that intelligent agents will find,
combine, and act upon information on the web, thereby performing routine day-to-day jobs
independently. The protocols that such intelligent agents will use to communicate with semantic web
services will play an extremely important role in materializing the next generation of the web. A
protocol may contain branches, which are decisions made on the basis of the previous information
exchange. Along with defining the information exchange between the client and the server in the form
of a set of query-answer pairs, independent actions will be described as a part of the protocol. An
action may be executed automatically or may need manual intervention for completion, but the
information required to initiate the action is provided by the answers to the previous queries. We
present an example of such a protocol in Section [5].

When two communicating web services use ontologies, with respect to semantic conflict the following 
scenarios are possible. 

Scenario-1 : If the web services choose to use the same ontology, there will be no semantic conflict.
In this paper we observe that the requirement that the ontologies used by communicating web
services must match is a very strong requirement which is often not needed in practice.

Scenario-2 : If two communicating web services use different ontologies, then they may potentially
reach a state where there is a semantic conflict/mismatch arising out of the differences between
their ontologies. For example, suppose the ontologies of web service A and web service B recognize
the class vehicle and its sub-classes, namely, car, truck and bike. The ontology of A defines color
as an attribute of class vehicle, whereas the ontology of B defines color as an attribute of the
sub-classes car and bike only. Now suppose A wants to follow the following protocol with B:
Step-1: Ask B for the registration number of a vehicle which is owned by a given person. 
Step-2: If B finds the registration number, then ask B for the color of the vehicle. 
Several executions of this protocol are possible for different valuations of the data exchanged by 
the protocol. Semantic conflicts arising out of the differences in ontologies may occur in some of 
these cases, but not always. For example: 

— If B does not find the registration number, then Step-2 is not executed and there is no semantic
conflict.

— If B finds the registration number and the vehicle happens to be a truck, then Step-2 of the 
protocol will lead to a semantic conflict, since in B's ontology, the color attribute is not defined 
for trucks. 

— If B finds the registration number and the vehicle happens to be a car or a bike, then Step-2 
will not lead to a semantic conflict, since in B's ontology, the color attribute is defined for cars 
and bikes. 

If the ontology of A and the protocol is made available to B, then B can formally verify whether 
any execution of the protocol may lead to a semantic conflict and warn A accordingly before the 
actual execution of the protocol begins. 

There has been considerable research in the recent past on matching ontologies and finding
semantic conflicts/mismatches between two ontologies [?]. In many cases, two web services
may have conflicting ontologies, but the protocol between them may avoid the conflict scenarios.
Consider the scenario where the direction of query-answer is reversed, that is, the same sequence of
queries is made by A and answered by B. Also A makes the query about the color of the vehicle only
if the vehicle is not a truck. In this case the conflict will not be sensitized by the protocol. In other
words, two agents may not agree on all concepts in their universe, but may still be able to support
certain protocols as long as they avoid the contentious issues - a fact which is often ignored in
world politics! Therefore an approach which rules out communication between two services on the
grounds that their ontologies do not match is too conservative in practice. Since the standardization
of ontologies and their acceptance in industrial practice seems to be a distant possibility, we believe
that the verification problem presented in this paper and its solution are very relevant at present.

Scenario-3 : The ontologies can be visualized as a combination of meta-data and a set of instances.
Classes, relations and data-types form the meta-data part of the ontology, whereas the individuals
and the valuations of the attributes are the actual data. It is often the case that the actual data
is stored in a database, and ontologies are used as a wrapper on top of the database. Therefore
the state of the database has to be incorporated when the server checks whether the protocol can
possibly reach a conflict state. Since the protocol between the client and the server typically has
branches and the decision for making the next query depends on the answer to the current
query, a conflict that is present at the ontology level may not be sensitized due to the answers
generated from the back-end database. We present a relational algebra based decision procedure to
check whether the conflicts that are present at the ontology level are actually present with respect
to the current state of the back-end database.

Scenario-4 : It is important to observe that the protocol has different runs depending on the
instantiation of the variables that are used in the protocol. Since the conflict may not be
sensitized in a particular run of the protocol, the server may choose to start the protocol and
check the possibility of getting into a conflict after every information exchange. Depending on how
the conversation progresses, the server may either continue to run the protocol or terminate the
conversation when it finds that the conflict is inevitable.

A preliminary version of this work is published in [?]. In that version we presented the verification
algorithm for Scenario-2. In this paper we include the algorithms for Scenario-3, i.e. the verification of
the spuriousness of an ontological conflict with respect to the current state of the back-end database.
We also show that the same algorithm can be used by the server for Scenario-4. The paper is organized
as follows. The syntax for describing a protocol is described in Section [2]. In Section [3] we present a
graph based model for representing the ontologies. The proposed formal method for detecting semantic
conflicts at the ontology level is presented in Section [4]. The notion of ontology with database, query
answering with the back-end database, and the algorithm to verify the conflicts at the ontology level in
the presence of the database are presented in Section [5]. Related works are briefly discussed in
Section [6]. Finally we present the conclusion in Section [7].

2 Protocol and Conflict 

In this section we present a formalism similar to SQL for the specification of the protocol. It may 
be noted that other formalisms can also be used to specify a protocol as long as the formalism has 
expressive power similar to the formalism used in this paper. We present two example protocols and 
also describe the notion of the conflict that we have addressed in this paper. 

2.1 Formal Description of the Protocol 

Typically, a protocol consists of a sequence of queries and answers. A query specifies a set of
variables through the "Get" keyword and a set of classes through the "from" keyword. The valuations
corresponding to the variable set are generated from those classes. An optional "where" keyword is
used to specify conditions on the variables. The answer to a query is a tuple of valuations
corresponding to the variable set specified in the query. Branching is specified using "if-else"
statements.
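
As a rough illustration of this query format, the sketch below models a single query as a small Python record and renders it in the Get/from/where syntax. The class name `Query`, its field names, and the `render` method are assumptions made for this example; they are not part of the formalism defined in this paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Query:
    """One "Get ... from ... where ..." query of a protocol (illustrative only)."""
    variables: Dict[str, str]   # attribute name -> variable, e.g. {"title": "t1"}
    classes: List[str]          # classes or specialization sequences, e.g. ["Book.Proceedings"]
    where: List[str] = field(default_factory=list)  # optional conditions, kept as plain strings

    def render(self) -> str:
        """Render the query in the Get/from/where syntax used in the text."""
        get = ", ".join(f"{attr} : {var}" for attr, var in self.variables.items())
        text = f"Get({get}) from {', '.join(self.classes)}"
        if self.where:
            text += " where " + " ".join(f"({cond})" for cond in self.where)
        return text


# A query asking for the title, author and date of one specific manual.
q1 = Query({"title": "t1", "author": "a", "date": "d1"}, ["Manual"], ["t1 = 'Manual Name'"])
print(q1.render())
```

Conditions are stored as uninterpreted strings here to keep the sketch short; a checker that evaluates branches would need a structured representation instead.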

2.2 Example of Protocol 

[Protocol - 1 :] Consider the protocol shown in Figure [1]. The protocol depicts a conversation
between a client and a server over the publication domain. The first query of the client is about the
author of some specific manual. Then the client makes a query to retrieve a book by the author of that
manual. According to the ontology of the client, 'Proceedings' is a subclass of 'Book', and the client
makes the next query to retrieve the proceedings by the same author. If the server does not recognize
'Proceedings' as a sub class of 'Book', the query cannot be answered by the server due to the mismatch
in the ontologies.

[Protocol - 2 :] In Figure [2] we present another protocol that exchanges information about the
automobile domain. The client makes a query to retrieve a brand which has sold more than a specific
number of vehicles in a particular year. The next query is made in the context of the previous query
to check whether that brand manufactures 'Red Trucks'. According to the ontology of the client, color
is a property of the vehicle class and therefore all subclasses of the vehicle class have the color
attribute. However, if the server recognizes 'color' as an attribute of some of the sub-classes
(suppose 'car' and 'two-wheeler') instead of as an attribute of the class 'Vehicle' itself, the query
cannot be answered by the server due to the mismatch in the ontology.

    Client -> Server:  Get(title : t1, author : a, date : d1)
                       from Manual
                       where t1 = 'Manual Name'
    Server -> Client:  (t1, a, d1)
    Client -> Server:  Get(title : t2, author : a)
                       from Book
    Server -> Client:  (t2, a)
    Client:            if (t2 != null)
    Client -> Server:  Get(title : t3, author : a, date : d2)
                       from Book.Proceedings
    Server -> Client:  (t3, a, d2)

Fig. 1. Protocol on Publication Domain



[Protocol - 3 :] In this example we present a protocol of an intelligent agent. Consider the semantic
web service for an online store. The online store can be queried to retrieve the relevant information
about the available items. Also consider a multi-cuisine restaurant which is a client of that store.
Whenever the stock of some item, say $i_1$, falls below some level, the intelligent agent that works on
behalf of the restaurant searches for the availability of $i_1$ by querying the online store. Suppose
$i_1$ comes in two qualities, $q_1$ and $q_2$. The protocol that is used by that agent to find and buy
the item under consideration is presented below using a format similar to pseudo code. Here the buy
action is carried out by the agent automatically if the precondition is satisfied.



Protocol for Buying an Item

    Get the availability of $i_1$ of quality $q_1$;
    If ($i_1$ of quality $q_1$ is available)
        Get the price of $i_1$ of quality $q_1$;
        If (the price is less than $C_1$)
            Buy $i_1$ of quantity $Q_1$;
        Else
            Inform the Manager of the store;
    Else
        Get the price of $i_1$ of quality $q_2$;
        If (the price is less than $C_2$)
            Buy $i_1$ of quantity $Q_2$;
        Else
            Inform the Manager of the store;
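
The branching structure of this protocol can be sketched as a small Python function. This is only an illustration: the function name `run_buy_protocol`, the dictionary-based arguments, and the sample prices, limits and quantities are assumptions for the example, and the answers of the store are passed in directly instead of being obtained through real queries.

```python
from typing import Dict


def run_buy_protocol(
    q1_available: bool,
    price: Dict[str, float],
    limit: Dict[str, float],
    quantity: Dict[str, int],
) -> str:
    """Follow the branches of the buying protocol and return the resulting action."""
    # The availability answer decides which quality the following price query is about.
    quality = "q1" if q1_available else "q2"
    if price[quality] < limit[quality]:
        return f"buy {quantity[quality]} units of quality {quality}"
    return "inform the manager"


# Hypothetical answers and thresholds (C1 = 5.0, C2 = 2.0, Q1 = 10, Q2 = 20).
prices = {"q1": 4.0, "q2": 2.5}
limits = {"q1": 5.0, "q2": 2.0}
quantities = {"q1": 10, "q2": 20}

print(run_buy_protocol(True, prices, limits, quantities))   # buy 10 units of quality q1
print(run_buy_protocol(False, prices, limits, quantities))  # inform the manager
```

Each call corresponds to one run of the protocol; different answers from the store lead to different runs, which is why a conflict may be reached in some executions only.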

    Client -> Server:  Get(Brand : b1, ItemsSold : c1, Year : y1)
                       from SaleStats
                       where (c1 > 10000) (y1 = 2009)
    Server -> Client:  (b1, c1, y1)
    Client -> Server:  Get(Brand : b1, Model : mod, Date : d1, Color : col)
                       from Vehicle.Truck
                       where (d1.year > 2000) (col = 'Red')
    Server -> Client:  (b1, mod, d1, col)

Fig. 2. Protocol on Automobile Domain



2.3 Notion of Mismatch between two Ontologies 

We focus on the following two types of mismatch between the client and server ontologies in this paper. 

Specialization Mismatch (Type-1): In this type of incompatibility the client recognizes a class $c_2$
as the specialization of another class $c_1$, whereas the server recognizes $c_2$ as the
specialization of some other class $c'_1$. Our first example (Figure [1]) is an instance of this type.

Attribute Assignment Mismatch (Type-2): A very common type of incompatibility arises where
the client and the server both recognize classes $c'_1, \ldots, c'_n$ as the specializations of
another class $c_1$, but the client associates an attribute $a$ with the super class $c_1$, whereas
the server associates $a$ with some of the sub classes $c'_i, \ldots, c'_j$, $1 \le i, j \le n$. Since
we view the mismatches from the query answering perspective, we use the notion of this conflict from
the query perspective. If the set of variables that is used in a query $q$ is not available at the
server side, we denote that as an attribute level (Type-2) mismatch. Our second example (Figure [2])
is an instance of this type.



3 Graph Model of Ontology

While describing an ontology using OWL, the classes and the attributes (modeled as properties in the
context of OWL) are used to represent the meta-data. We use a graph based approach to model the
meta-data that are described as classes and attributes in OWL. Since OWL uses properties to express
attributes, we use the terms property and attribute interchangeably. We define the ontology graph as
follows.

Definition 1. A graph model for an ontology $O$ is $G = (V, E)$ where $V$ is the set of vertices and
$E$ is the set of directed edges. Each node $v_i \in V$ represents a class in the OWL ontology, and
$v_i$ is associated with a property list $C(v_i)$ whose elements are the data properties of the class.
The directed edges can be of the following types.

Inheritance-Edge : There is an inheritance-edge $e_{ij} \in E$ from $v_i$ to $v_j$, where
$v_i, v_j \in V$, if $v_j$ is a sub class of $v_i$.

Property-Edge : There is a property-edge $e_{ij} \in E$ from $v_i$ to $v_j$, where $v_i, v_j \in V$,
if $v_j$ is an object property of $v_i$.
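
A minimal sketch of such a graph in Python is given below, populated with the ontology of service B from the vehicle example in Section 1. The class name `OntologyGraph`, its method names, and the attribute label `registration_number` are assumptions introduced for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class OntologyGraph:
    """Classes as nodes, with a property list per node and two kinds of directed edges."""
    properties: Dict[str, Set[str]] = field(default_factory=dict)         # node -> data properties
    inheritance: Dict[str, Set[str]] = field(default_factory=dict)        # super class -> sub classes
    object_properties: Dict[str, Set[str]] = field(default_factory=dict)  # class -> related classes

    def add_class(self, name: str, data_properties: Iterable[str] = ()) -> None:
        self.properties.setdefault(name, set()).update(data_properties)

    def add_inheritance_edge(self, super_class: str, sub_class: str) -> None:
        self.inheritance.setdefault(super_class, set()).add(sub_class)

    def add_property_edge(self, source: str, target: str) -> None:
        self.object_properties.setdefault(source, set()).add(target)

    def is_sub_class(self, super_class: str, sub_class: str) -> bool:
        """True if there is a direct inheritance edge from super_class to sub_class."""
        return sub_class in self.inheritance.get(super_class, set())


# Ontology of service B: color is defined only on car and bike, not on truck.
server = OntologyGraph()
server.add_class("vehicle", ["registration_number"])
server.add_class("car", ["color"])
server.add_class("truck")
server.add_class("bike", ["color"])
for sub in ("car", "truck", "bike"):
    server.add_inheritance_edge("vehicle", sub)

print(server.is_sub_class("vehicle", "truck"))  # True
print("color" in server.properties["truck"])    # False
```

Storing the inheritance edges from super class to sub class mirrors the edge direction of Definition 1 and makes the sub class test a single set lookup.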

4 Overview of the Method 

In this section we present the relevant formalisms and the overall algorithm for solving the
problem. The variable set and the class set specified in a query $q$ are denoted by $S_v(q)$ and
$S_c(q)$ respectively. We present a graph search based structural matching algorithm to check the
semantic safety of the protocol.

Definition 2. The specialization sequence $\sigma = (c_1.c_2.\cdots.c_k)$ in a query $q$ is the
sequence of classes that are concatenated through the '.' operator, and for any two consecutive
classes $c_i$ and $c_{i+1}$ in the sequence, $c_i$ is the super class of $c_{i+1}$. Therefore the
elements of $S_c(q)$ can be individual classes or specialization sequences.

4.1 Structural Algorithm to Check the Semantic Consistency 



Algorithm 1: Check-Consistency

    input : The protocol $\mathcal{P}$ and the server ontology $O_s$
    $V \leftarrow \{\}$;
    foreach query $q$ in the protocol $\mathcal{P}$ do
        foreach element $\tau$ in $S_c(q)$ do
            if $\tau$ is a specialization sequence then
                $c_1 \leftarrow$ the first concept of $\tau$;
                $c_t \leftarrow$ FindMatch($O_s$, $c_1$);
                for $i \leftarrow 2$ to length($\tau$) do
                    $c_m \leftarrow$ the $i$-th concept of $\tau$;
                    if no class $c'_t$ equivalent to $c_m$ is found as a sub class of $c_t$ in $O_s$ then
                        Report Mismatch at $c_m$;
                    else
                        $c_t \leftarrow c'_t$;
                    end
                end
                $V \leftarrow V \cup$ property set for $c_t$;
            else
                /* $\tau$ is an individual class */
                $c_1 \leftarrow \tau$;
                $c_t \leftarrow$ FindMatch($O_s$, $c_1$);
                $V \leftarrow V \cup$ property set for $c_t$;
            end
        end
        if $S_v(q) \not\subseteq V$ then
            Report $\{S_v(q) - V\}$ as unmatched variables;
        end
    end

Function FindMatch($O_s$, $c_i$)

    Find the class $c_t$ which is equivalent to $c_i$ in $O_s$;
    if $c_t$ is not found in $O_s$ then
        Report Mismatch at $c_i$;
        exit;
    end
    return $c_t$;
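
The structural check can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: class equivalence is approximated by equality of class names, the server ontology is given as two plain dictionaries, the set of available properties is rebuilt for every query, and unmatched variables are reported only when all classes of the query were matched. The toy server ontology below is made up for the example.

```python
from typing import Dict, List, Sequence, Set, Tuple

Query = Tuple[Set[str], Sequence[str]]  # (requested attributes, class elements)


def check_consistency(
    protocol: Sequence[Query],
    subclasses: Dict[str, Set[str]],
    properties: Dict[str, Set[str]],
) -> List[str]:
    """Return mismatch reports for a protocol against a server ontology."""
    reports: List[str] = []
    for attributes, elements in protocol:
        available: Set[str] = set()  # assumption: rebuilt for every query
        all_matched = True
        for element in elements:
            chain = element.split(".")      # "Book.Proceedings" -> ["Book", "Proceedings"]
            current = chain[0]
            if current not in properties:   # no equivalent class on the server
                reports.append(f"class mismatch at {current}")
                all_matched = False
                continue
            for name in chain[1:]:
                if name not in subclasses.get(current, set()):
                    reports.append(f"specialization mismatch at {name}")
                    all_matched = False
                    break
                current = name
            else:
                # The whole chain was matched: collect the properties of its last class.
                available |= properties.get(current, set())
        missing = set(attributes) - available
        if all_matched and missing:
            reports.append(f"unmatched variables: {sorted(missing)}")
    return reports


# Toy server ontology (hypothetical): Proceedings exists but is not a sub class of Book.
server_subclasses = {"Book": {"Monograph", "Collection"}}
server_properties = {
    "Manual": {"title", "author", "date"},
    "Book": {"title", "author", "date"},
    "Proceedings": {"title", "author", "date"},
}
publication_protocol = [
    ({"title", "author", "date"}, ["Manual"]),
    ({"title", "author"}, ["Book"]),
    ({"title", "author", "date"}, ["Book.Proceedings"]),
]
print(check_consistency(publication_protocol, server_subclasses, server_properties))
# ['specialization mismatch at Proceedings']
```

Unlike FindMatch, which exits on the first unmatched class, this sketch keeps going and collects all reports, which is convenient when the server wants to warn the client about every problematic query at once.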



4.2 Working Example 

We present a working example to describe how the algorithm works. Consider the protocol shown in
Figure [1]. We elaborate the steps of applying Algorithm [1] with respect to the fragments of the client
and server ontologies shown in Figure [3] and Figure [4] respectively. These fragments are taken from the
benchmark provided by [10]. The benchmark has one reference ontology and four other real ontologies,
and the domain of these ontologies is bibliographic references. We have used the reference ontology as
the server ontology and another real ontology named INRIA as the client ontology. We have used a
pictorial representation similar to an entity-relationship diagram to show the fragments of the
ontologies. The classes are represented by rounded rectangles and the ovals represent the properties
of a particular class. The class hierarchy is shown using arrows, that is, a sub class is connected to its
super class by an arrow which is directed towards the sub class. The properties that belong to a
particular class are connected to the rounded rectangle corresponding to that class through a line.

Step-1: While applying Algorithm [1] to the server ontology, the individual class 'Manual' is searched
for. Since the search is successful, it is checked that the attributes associated with class
'Manual' in the query in the protocol are actually answerable by the server. This check turns
out to be successful for the ontologies that are presented here.

Step-2: The next query uses the class 'Book'. Algorithm [1] performs the consistency check in the
same way as for the previous query, and the check is successful.

Step-3: The third query uses a specialization sequence 'Book.Proceedings'. Algorithm [1] searches for
the 'Book' class in the server ontology and then checks whether 'Proceedings' is a sub class of 'Book'
in the server ontology. Algorithm [1] reports a failure since in the server ontology 'Proceedings' is
not a sub class of 'Book'.



4.3 Proof of Correctness 

Theorem 1. [Soundness] The mismatches returned by Algorithm [1] are correct.

Proof. Algorithm [1] reports a mismatch in three cases. We examine each of the cases as follows.

Mismatch in individual class: If Algorithm [1] does not find a class matching a class $c$ which is
used in a query, a conflict is reported. Since the class is not recognized by the server, it is not
possible for the server to answer the query. Therefore the outcome of the algorithm is correct.

Mismatch in specialization sequence: Consider a specialization sequence
$\sigma = (c_1.c_2.\cdots.c_k)$ in a query $q$ on which Algorithm [1] returns a mismatch. We prove the
correctness of the consistency checking by induction on the length $k$ of $\sigma$.

Basis ($k = 1$): In this case there is only one class in the specialization sequence and this case falls
under the case of mismatch in individual classes.

Inductive Step: Suppose Algorithm [1] returns the mismatch correctly for specialization sequences
having length $k$. We prove that Algorithm [1] reports the conflicts correctly for the specialization
sequences having length $k + 1$. There can be two possible cases.

Fig. 3. Fragment of Client Ontology

a. The conflict is reported for a class that appears in the $i$-th location of the sequence, where
$1 \le i < k + 1$. The reported mismatch is correct according to the inductive hypothesis.

b. The conflict is reported for the $(k+1)$-th class of the sequence. In this case there exists a
matching specialization sequence in the server ontology up to length $k$, but $c_{k+1}$ is not a sub
class of class $c_k$ according to the server ontology. Therefore the conflict reported by
Algorithm [1] is correct.

Mismatch on variables: Suppose the set of variables specified by the client in a query $q$
corresponding to the class set $S_c(q)$ is $V_c$, and the failure is reported on some variable in
$V_c$. Since Algorithm [1] first finds the matches corresponding to the classes in $S_c(q)$ and then
checks for the answerability with respect to the variable set, in this case every class in $S_c(q)$ is
matched with a suitable class on the server side. Now Algorithm [1] reports a conflict if there exists
any variable that is not recognized by the server as an attribute of at least one of the classes that
correspond to the classes in $S_c(q)$. Therefore the reported conflict falls under the Type-2 or
attribute level conflict category. □

Theorem 2. [Completeness] For any protocol $\mathcal{P}$, if there is any mismatch of Type-1 or
Type-2, Algorithm [1] reports it.

Proof. This proof is done by construction. For each type of mismatch we show that
Algorithm [1] uses a sequence of operations through which the mismatch is detected. We present the
proof for each mismatch type.

Fig. 4. Fragment of Server Ontology

Type-1 Mismatch: Consider a specialization sequence $\sigma = (c_1.c_2.\cdots.c_k)$ which is used in
query $q$. Algorithm [1] starts by finding the class that is equivalent to $c_1$ at the server side. If
there is only one class in $\sigma$, then Algorithm [1] reports a mismatch when the corresponding class
is not found in the server ontology. When the length of $\sigma$ is greater than 1, Algorithm [1]
continues to check whether $c_{i+1}$ is a subclass of $c_i$, where $1 \le i < k$. A mismatch is
reported by Algorithm [1] whenever $c_{i+1}$ is not a subclass of $c_i$ in the server ontology for some
$1 \le i < k$. Hence if there exists any mismatch in any specialization sequence, the algorithm
reports it.

Type-2 Mismatch: Consider a query $q$ made by the client, where the set of variables in $q$ is $V_c$
and the set of classes is denoted by $S_c(q)$. We argue that, if there exists a Type-2 mismatch for
query $q$, Algorithm [1] reports it. For Type-2 mismatches Algorithm [1] first checks the presence of
the equivalent classes $c_i^s$ in the server ontology and computes the union $V_s$ of the attributes
corresponding to every $c_i^s$. If there are any variables in $V_c$ that are not present in $V_s$, a
conflict is reported by Algorithm [1]. Hence if there exists a Type-2 mismatch for a query,
Algorithm [1] reports it. □

5 Ontology with Back-end Database 

In this section we describe the two level representation for describing ontologies: OWL is used to
describe the classification and a database is used to store the instances. This type of representation
is helpful for describing domains with a large number of instances. From the point of view of the
instances of classes, the classes in an ontology can be categorized as follows.

a. Classes of Abstract Type - these classes are used purely for the purpose of describing a domain
hierarchically. These classes do not have any instances. They act only as the super class of other
classes.

b. Classes with Instances - these classes may act as super class of other classes but they have a
non-empty set of instances.

Consider the ontology fragment in Fig. [4]. Here Entry, Informal, and Composite are examples of
abstract classes. On the other hand, Book, Monograph etc. are examples of classes with instances.
Although Book is a super class of Monograph and Collection, it is possible to have instances of Book
which are neither Monograph nor Collection.

While using the two level representation, it is important to keep the database schema consistent
with the wrapper ontology. One choice for the database schema is to maintain a table
for each of the non-abstract classes present in the ontology. Alternative ways of describing the
database are possible, but we use this simple representation of the database schema to present the
proposed algorithm.

5.1 Query Answering in the Presence of the Database 

When the server side adheres to the two layer structure for its ontology, every query in the protocol
is answered by generating the corresponding tuples from the back-end database. In the context of the
back-end database, the occurrences of variables in a protocol can be categorized into the following
types.

Uninstantiated: When a variable is placed in a query for the first time without initialization, it is
referred to as an uninstantiated occurrence of the variable, or in short an uninstantiated variable.
The values for such variables are instantiated at the side where the query is evaluated.

Instantiated: Other than the first occurrence without initialization, all other occurrences of a
variable are referred to as instantiated occurrences of that variable, or in short instantiated
variables. At these occurrences, the variables have already been assigned some value by the server.
These occurrences are used for value propagation.

[Evaluation Semantics of a Query :] The semantics of the evaluation of a query is similar to that of
conjunctive Datalog. The evaluator of the query tries to assign values to the uninstantiated variables
and forms a tuple which satisfies the logical and of the conditions specified in the where clause of
the query. The same variable occurring in different classes specified in the query has to be assigned
the same value.

Consider the protocol presented in Fig. [1]. In Section [4.2] we have shown that the protocol has an
ontological conflict when the client and the server use the ontologies in Fig. [3] and Fig. [4]
respectively. Consider the fact that the condition (t2 != null) may always evaluate to false due to the
actual data that is stored in the database of the server. In that case, the ontological conflict in the
last query, [Get(title : t3, author : a, date : d2) from Book.Proceedings], will never be sensitized. In
other words, the conflicts at the ontology level may turn out to be spurious. We define the
spuriousness of an ontological conflict as follows.

Definition 3. An ontological conflict is spurious when, for all possible correct instantiations of the
variables, the conflict is not reachable from the start state of the protocol due to the decisions taken
at different stages of the protocol. By correct instantiations we mean the instantiations that conform
to the evaluation semantics defined earlier.

5.2 Related Formalisms 

Here we present the relevant formalisms for describing the algorithm that checks whether the
conflict detected by Algorithm [1] is present at the current state of the server database.

Definition 4. The assignable set of values for a variable $\varphi$ is the set of values that can be
assigned to $\varphi$ during the instantiation, and it is denoted as $AssignableSet(\varphi)$.

Suppose in a protocol $\mathcal{P}$, a query $q$ has variable set $v = \{\varphi_1, \ldots, \varphi_n\}$
and concept set $C = \{C_1, \ldots, C_m\}$. Let us also assume that in $\mathcal{P}$ all the variables
of $q$ are uninstantiated variables. The notion of assignable set in the presence of previously
instantiated variables is discussed later. The evaluation of the query assigns a value to each of the
variables in that query. All the variables together form a tuple
$\tau = (val_1, val_2, \ldots, val_n)$ such that if any variable $\varphi_k$ is common between class
$C_i$ and class $C_j$, then both classes have to assign the same value to the variable $\varphi_k$. All
such possible tuples that can be populated by the evaluator side form the assignable set of values for
$v$, and the assignable set for a variable $\varphi_i$ is:

$AssignableSet(\varphi_i) = \{ val \mid \exists \tau \in AssignableSet(v) \wedge \tau = (val_1, val_2, \ldots, val_n) \wedge val_i = val \}$

The dependencies among the variables play an important role in determining the AssignableSet for a
variable.
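
To make the definition concrete, the following sketch computes the assignable set of a variable set with a natural join over two concept tables, and then projects it onto single variables. The helper names `natural_join` and `assignable_set`, the representation of tuples as dictionaries, and the sample rows are all assumptions made for this illustration.

```python
from typing import Dict, List, Set

Row = Dict[str, object]  # one tuple: variable name -> value


def natural_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Combine rows that agree on every variable the two concepts share."""
    joined: List[Row] = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[name] == r_row[name] for name in shared):
                joined.append({**l_row, **r_row})
    return joined


def assignable_set(tuples: List[Row], variable: str) -> Set[object]:
    """Project the assignable tuples of a variable set onto one variable."""
    return {row[variable] for row in tuples}


# Hypothetical instances: the variable "a" (author) is shared by both concepts.
manual_rows = [{"t1": "Manual Name", "a": "Smith"}, {"t1": "Other Manual", "a": "Lee"}]
book_rows = [{"t2": "Book One", "a": "Smith"}, {"t2": "Book Two", "a": "Kim"}]

tuples = natural_join(manual_rows, book_rows)
print(assignable_set(tuples, "a"))   # {'Smith'}
print(assignable_set(tuples, "t2"))  # {'Book One'}
```

Only the author that occurs in both tables survives the join, which is the requirement that a variable common to two classes receives the same value from both.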

Definition 5. In a query, if some of the variables are previously instantiated, we say that the
previously instantiated set of variables constrains the set of values of the uninstantiated variables.
Suppose that, among the variables specified in the same query $q$, $\varphi_1, \ldots, \varphi_k$ are
previously instantiated and $\varphi_{k+1}, \ldots, \varphi_n$ are the variables that are instantiated
by the evaluation of $q$. We define the constrain relation $\mathcal{R}_c$ and the ConstrainSet as
follows.

$\mathcal{R}_c = \{ (\varphi_i, \varphi_j) \mid \varphi_i \in \{\varphi_1, \ldots, \varphi_k\} \text{ and } \varphi_j \in \{\varphi_{k+1}, \ldots, \varphi_n\} \}$
$ConstrainSet(\varphi_i) = \{\varphi_{k+1}, \varphi_{k+2}, \ldots, \varphi_n\}$

Consider the same query $q$. The AssignableSet for the set of variables of $q$ is the set of all tuples
$\tau = (val_1, val_2, \ldots, val_n)$ such that the following conditions hold.

— If any variable $\varphi_k$ is placed in more than one concept, all the concepts assign the same
value to $\varphi_k$.

— $(val_1 \in A_1) \wedge \cdots \wedge (val_k \in A_k)$, where $A_1, \ldots, A_k$ are the assignable
sets of the variables $\varphi_1, \ldots, \varphi_k$ respectively.

Definition 6. The RestrictSet for a variable set $v$ is obtained by computing the transitive closure of
$\mathcal{R}_c$ on $v$.

We use the notion of the split operation on the assignable set of values of a variable, and it works
as follows. Let a query $q$ consist of a concept $C_i$ with an uninstantiated variable $\varphi_i$ and a
previously instantiated variable $\varphi_j$. Suppose a decision is made on the variable $\varphi_j$.
In each branch, the possible values of $\varphi_j$ form a subset of its assignable set. Since the value
of $\varphi_i$ is dependent on $\varphi_j$, in each branch the possible values for $\varphi_i$ also
form a subset of the assignable set of $\varphi_i$.

Definition 7. The SplitSet for a variable set $v$ is a subset of $RestrictSet(v)$ and is defined as:

$SplitSet(v) = \{ \varphi_j \mid \varphi_j \in RestrictSet(\varphi_i) \text{ and } \varphi_j \text{ appears in a condition in the path of the protocol from the start of the protocol to the query with ontological conflict on } \varphi_i \}$

Definition 8. The Relevant ConditionSet of a variable set $v$ is the set of conditions in true form on
the variable set $v_{split}$, which have to be true for reaching the conflicting query.
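
One possible reading of these two definitions is sketched below in Python. It assumes that the RestrictSet of a variable set collects every variable reachable from it by following pairs of the constrain relation; this reading, the function names `restrict_set` and `split_set`, and the constrain pairs listed for the publication protocol are assumptions made for illustration and may differ from the intended construction.

```python
from typing import Iterable, Set, Tuple


def restrict_set(variables: Iterable[str], constrain_pairs: Set[Tuple[str, str]]) -> Set[str]:
    """Variables reachable from `variables` through the constrain relation (assumed reading)."""
    reached: Set[str] = set()
    frontier = list(variables)
    while frontier:
        current = frontier.pop()
        for source, target in constrain_pairs:
            if source == current and target not in reached:
                reached.add(target)
                frontier.append(target)
    return reached


def split_set(
    variables: Iterable[str],
    constrain_pairs: Set[Tuple[str, str]],
    condition_variables: Iterable[str],
) -> Set[str]:
    """Members of the RestrictSet that occur in a condition on the path to the conflict."""
    return restrict_set(variables, constrain_pairs) & set(condition_variables)


# Publication protocol: the instantiated author variable "a" constrains the variables
# that later queries instantiate; the only condition on the path is (t2 != null).
pairs = {("a", "t2"), ("a", "t3"), ("a", "d2")}
print(sorted(restrict_set({"a"}, pairs)))       # ['d2', 't2', 't3']
print(sorted(split_set({"a"}, pairs, {"t2"})))  # ['t2']
```

Under this reading the split set singles out t2, the variable whose condition guards the conflicting query.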
5.3 Algorithm for Detecting Spurious Conflicts with respect to the Back-end Databases



Algorithm 2: Verify the Conflicts on Back-end Database 



1 Initialize a hash table H l ; 

/* In the hash table H* , a set of variables v forms the key, which is mapped to the 

AssignableSet of the variable set v */ 

2 foreach conflicting query q do 

3 v <— The set of instantiated variables specified in q; 

4 if Verify Conflict(v) then 

5 j| Report mismatch on variable v at database level; 
else Report the conflict as spurious; 

end 



Function VerifyConflict(v)

v_restrict ← the RestrictSet for the variable set v;
v_split ← the SplitSet for the variable set v;
V_restrict ← MakeSets(v_restrict);
Construct a priority queue Γ of variable sets;
/* Γ is ordered according to the order of the instantiations of its variable sets */
forall the variable sets v_i ∈ V_restrict do
    Enqueue v_i in Γ;
end
Table set S^t ← {};
while Γ is not empty do
    u ← Dequeue(Γ);
    if VerifyConflict(u) then
        /* The set of possible valuations for u is not empty */
        t ← Search H^t and return the table containing u;
        if t ∉ S^t then
            S^t ← S^t ∪ {t};
        end
    else
        /* The set of possible valuations for u is empty, so the conflict is spurious */
        return false;
    end
end
Find the query q that instantiates variable set v;
δ ← GenerateAssignableSet(q, S^t);
if v_split ≠ ∅ then
    c ← the RelevantConditionSet on the variable set v_split;
    δ ← SplitAssignableSet(δ, v_split, c);
end
if δ == ∅ then
    Report the conflict on v as spurious;
    return false;
else
    Insert δ in H^t;
    return true;
end

Function MakeSets(v)

Initialize set of variable sets v_ret = {};
while v is not empty do
    Find a query q that instantiates some of the variables in v;
    Initialize variable set v_temp = {};
    forall the variables φ_i ∈ v such that φ_i is instantiated by q do
        v ← v − {φ_i};
        v_temp ← v_temp ∪ {φ_i};
    end
    v_ret ← v_ret ∪ {v_temp};
end
return v_ret;
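The grouping performed by MakeSets can be sketched in Python as follows. This is only an illustration under stated assumptions: the mapping `instantiated_by` (variable name to the identifier of the query that instantiates it) is a hypothetical input that the paper does not define in this form, and every variable is assumed to be instantiated by exactly one query.

```python
def make_sets(variables, instantiated_by):
    """Partition `variables` into groups instantiated by the same query.

    instantiated_by is a hypothetical dict: variable name -> query id.
    """
    remaining = set(variables)
    groups = []
    while remaining:
        # Pick a query that instantiates at least one remaining variable.
        # min() is used only to make the choice deterministic.
        query = instantiated_by[min(remaining)]
        group = {var for var in remaining if instantiated_by[var] == query}
        remaining -= group
        groups.append(group)
    return groups


groups = make_sets(
    {"phi_1", "phi_2", "phi_3"},
    {"phi_1": "q1", "phi_2": "q2", "phi_3": "q1"},
)
# Two groups result: the variables instantiated by q1, then those by q2.
print(len(groups))  # 2
```

Each pass removes at least one variable from `remaining`, so the loop terminates, and the returned groups are pairwise disjoint and cover the input set.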



Function GenerateAssignableSet(q, S^t)

/* Suppose q is made with the concepts C_1, ..., C_n and φ_i1, ..., φ_ik are the uninstantiated
   variables corresponding to the concept C_i */
v ← {φ_ij | φ_ij ≠ *};
if S^t == ∅ then
    /* All the variables of q are uninstantiated */
    Tuple set T ← (C_1 ⋈ C_2 ⋈ ... ⋈ C_n);
else
    /* Some of the variables of q are previously instantiated and t_1, ..., t_m ∈ S^t are the
       tuple sets corresponding to those variables */
    Tuple set T ← (C_1 ⋈ C_2 ⋈ ... ⋈ C_n ⋈ t_1 ⋈ ... ⋈ t_m);
end
Relational algebra query q_Rel ← π_v(T);
Compute q_Rel and return the set of tuples;
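The join-then-project step of GenerateAssignableSet can be sketched in Python over in-memory tables. This is an illustration under assumptions, not the implementation used in the paper: a table is modelled as a list of dicts (attribute name to value), the helper names are hypothetical, and at least one concept table is assumed to be given.

```python
def natural_join(left, right):
    """Natural join of two tables given as lists of dicts."""
    joined = []
    for a in left:
        for b in right:
            shared = a.keys() & b.keys()
            # Rows combine only if they agree on every shared attribute.
            if all(a[k] == b[k] for k in shared):
                joined.append({**a, **b})
    return joined


def project(rows, attrs):
    """Projection onto attrs, with duplicate tuples removed."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def generate_assignable_set(concepts, prior_tables, variables):
    """Join all concept tables with the previously computed tables,
    then project onto the variables of interest."""
    tables = list(concepts) + list(prior_tables)
    joined = tables[0]
    for table in tables[1:]:
        joined = natural_join(joined, table)
    return project(joined, variables)


book = [{"isbn": 1, "author": "A"}, {"isbn": 2, "author": "B"}]
order = [{"isbn": 1, "qty": 3}, {"isbn": 3, "qty": 1}]
earlier = [{"author": "A"}]  # table of a previously instantiated variable
print(generate_assignable_set([book, order], [earlier], ["isbn"]))
# [{'isbn': 1}]
```

If the earlier table were `[{"author": "B"}]`, the join would be empty and the returned assignable set would be the empty list, which corresponds to the case in which the conflict is reported as spurious.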



Function SplitAssignableSet(δ, v_split, c)

/* Suppose c_1, ..., c_i ∈ c */
Relational algebra query q_Rel ← σ_(c_1 ∨ c_2 ∨ ... ∨ c_i)(δ);
Compute q_Rel and return the set of tuples;
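The selection step of SplitAssignableSet can be sketched in Python as a filter over the tuple set. The encoding is an assumption for illustration: tuples are dicts, and each condition is a Python predicate over a tuple. The conditions are combined by disjunction, following the selection as written in the function above.

```python
def split_assignable_set(delta, conditions):
    """Keep the tuples of delta that satisfy at least one condition.

    conditions is a hypothetical encoding of the RelevantConditionSet:
    a list of predicates, each taking a tuple (dict) and returning a bool.
    """
    return [row for row in delta if any(cond(row) for cond in conditions)]


delta = [{"price": 5}, {"price": 50}, {"price": 500}]
conditions = [
    lambda row: row["price"] < 10,
    lambda row: row["price"] > 100,
]
print(split_assignable_set(delta, conditions))
# [{'price': 5}, {'price': 500}]
```

An empty result means that no valuation of the split variables lets the protocol reach the conflicting query at the current database state.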

This algorithm can also be used by the server as the protocol progresses (described as Scenario-4 in
Section [T]). In that case, the variables in the queries that have already been executed have values
assigned to them, and the algorithm treats those variables as instantiated.



5.4 Proof of Correctness 

The proof of correctness of Algo. 2 is presented below. Algo. 2 verifies the spuriousness of conflicts
returned by Algo. 1 on the server database.

Theorem 3. [Soundness] Algorithm 2 correctly reports the spuriousness of a conflict on the set of
variables v', where v' = v ∪ RestrictSet(v) and v is the set of previously instantiated variables in a query
q of protocol 𝒫 with ontological conflict.

Proof. The proof is by induction on the integer parameter n, where n is the total number of
VerifyConflict function calls made by Algorithm 2 for q. Among these calls, the first is made by
Algorithm 2 and the others are recursive calls.

[Basis (n = 1):] In this case RestrictSet(v) = ∅. If AssignableSet(v) is ∅, Algo. 2
correctly reports the conflict as spurious; otherwise Algo. 2 reports the conflict as not spurious, which
is correct.

[Inductive Step:] We assume that the spuriousness of a conflict reported in n steps for the queries with
ontological conflict is correct. We now prove that the spuriousness of a conflict reported
in (n + 1) steps is correct. Consider the VerifyConflict function call made by Algo. 2. Without loss of
generality, we can take this call to be the (n + 1)-th function call (in the order in which the function
calls return). The other calls are therefore recursive calls made by VerifyConflict to itself.
The following two cases are possible.

a. The conflict is detected as spurious by some call other than the (n + 1)-th call. In this case
the reported spuriousness of the conflict is correct by the inductive hypothesis.

b. The conflict is detected as spurious at the (n + 1)-th call to VerifyConflict. All previous
calls to VerifyConflict add a table to H^t, and the set of these tables is kept in S^t. After that, the function
GenerateAssignableSet is called to compute the assignable set for the set of previously instantiated
variables v in the query q with ontological conflict. It follows from the description of the function
that it restricts the set of valuations of v by taking the natural join with the valuations
of the variables in RestrictSet(v). Since the conflict is not detected as spurious for the variables in
RestrictSet(v), when the function detects the conflict as spurious the condition δ == ∅ is true.
Therefore q is not reachable from the start state of the protocol. □

Theorem 4. [Completeness] If there is a spurious conflict on the set of variables v', where v' = v ∪
RestrictSet(v) and v is the previously instantiated variable set specified in a query q of protocol 𝒫 with
ontological conflict, the algorithm reports it. We prove this by establishing the contrapositive of the
statement, i.e., if Algorithm 2 reports the ontological conflict as not spurious, then q is reachable from
the start state of 𝒫.

Proof. Suppose v' = {φ_1, ..., φ_n}. Let the valuations of the variables in v' be (val_1, ..., val_n) when
the conflict in q is not spurious. In this case the conflict may occur in the following way. Consider
the VerifyConflict function calls made to determine the spuriousness of the ontological conflict in q,
among which the first call is made by Algo. 2 and the subsequent calls are recursive calls. The conflict
is detected as not spurious only if all the recursive calls to VerifyConflict add a table to H^t and the
set of these tables is kept in S^t. Since the conflict is determined as not spurious, δ is not
empty. Therefore, in 𝒫, q is reachable from the start state of the protocol using any instantiation of
variables belonging to δ. □



6 Related Works 

Different aspects of web service interaction have been an active area of research. However, most of
this research considers the interaction at the syntactic level. Foster et al. addressed the compatibility
verification of web services in [11]. They adopted a model-based approach for checking the
compatibility of web services at different levels of abstraction. However, the semantics of the exchanged
data is not addressed. In [12], the researchers address interaction among web services that
is asynchronous in nature and propose a design pattern to help the development of composite web
services based on asynchronous interaction. Zhao et al. provide a formal treatment of web service
choreography. They define a formal model of WS-CDL and propose a methodology to formally
verify the correctness of a choreography using the model checker SPIN. In [3], the authors proposed
a formalism for specifying web service interfaces. They discuss three kinds of constraints
that a web service interface can impose. Propositional constraints are imposed by an interface
by specifying the methods that can be invoked by the clients, along with the constraints on the input
and output parameters (signature constraints). Protocol constraints specify the temporal requirements
on the sequence of method invocations. An algorithm is proposed to check compatibility among
web services based on these constraints. However, all the proposed verification strategies
work at a syntactic level, without considering the semantics of the exchanged data.

On the other hand, current research on the semantic web is focused on standardizing
the ontologies used by web services, with a vision of computers becoming capable of analyzing
all web data. Semantic matchmaking IH and discovery of semantic web services TfJ 17 , 3] are 
two important research directions in the semantic web. The underlying objective of these approaches is
to compare facts belonging to different ontologies and to evaluate their compatibility. Standards such as
RDF, OWL, and WSML have been developed for this purpose.

Ontologies play an important role in enhancing the integration and interoperability of
semantic web services. A significant amount of research has been done towards formalizing the notion
of conflict between two ontologies. In [6], the authors present a detailed classification of conflicts by
distinguishing between conceptualization and explication mismatches. In [19], the authors further generalize
the notion of conflicts and classify semantic mismatches into language level mismatches and ontology level
mismatches. Ontology level mismatches are then further classified into conceptualization mismatches and
explication mismatches. Further research in the same direction [20] adds a few new types of
conceptualization mismatches. The researchers in [21] present alternative types of conflicts that are primarily
relevant to OWL based ontologies. However, the primary focus of these works is the interoperability between
two ontologies, rather than the correctness of the protocol for information exchange with respect to
the interpretation.



Ontology mapping primarily focuses on combining multiple heterogeneous ontologies. In [22], the
authors address the problem of specifying a mapping between a global ontology and a set of local ontologies.
In [23], the authors discuss establishing a mapping between local ontologies. In [24], the problem of ontology
alignment and automatic merging is addressed.

A significant amount of research has been done on the development of protocols. In [25], the
researchers proposed a methodology for developing protocols in a multi-agent environment. They
extend propositional dynamic logic to formally specify the protocol and also use an extension of
statecharts for visual representation. In [26], a step-by-step procedure is presented for the development
of web service interaction protocols, from the problem definition to the final specification. However,
these approaches focus on the development of protocols for multi-agent environments. The
semantics of the exchanged data is not addressed in these works.

The problem of checking compatibility between two ontologies with respect to a protocol is new 
and to the best of our knowledge there is no prior work on this topic. 



7 Conclusion 

In this paper we addressed the problem of detecting semantic mismatches where the
data exchange between two ontologies is defined in terms of a protocol. We believe that the proposed
methodology will be very helpful for the integration of web services that are developed independently.
Moreover, the future of internet applications lies in exchanging knowledge, where semantic conflict will
be a major issue.